A Query Processing Framework for Array-Based Computations
Abstract
Current scientific applications must analyze enormous amounts of array data using complex mathematical data processing methods. This paper describes a distributed query processing framework for large-scale scientific data analysis that captures array-based computations as SQL-like queries, then optimizes and evaluates them using state-of-the-art parallel processing algorithms. Instead of providing a library of concrete distributed algorithms that implement certain matrix operations efficiently, we generalize these algorithms by making them parametric, so that the efficient implementations that apply to the concrete algorithms also apply to their generic counterparts. By specifying matrix operations as generic algebraic operators, we can perform inter-operator optimizations, such as fusing matrix transpose with matrix multiplication, which result in new instantiations of the generic algebraic operators without requiring new efficient algorithms to be introduced on the fly. We evaluate the effectiveness of our framework by measuring the performance improvement of matrix factorization when evaluated with inter-operator optimization.
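The idea of fusing a transpose into a parametric multiplication operator can be illustrated with a small single-machine sketch. This is not the paper's implementation: the sparse dictionary representation, the function names (`generic_product`, `multiply`, `multiply_transposed_left`), and the index-position parameters are all assumptions made here for illustration. The sketch only shows that a product with a transposed left operand can be expressed as another instantiation of one generic join-and-aggregate operator, without materializing the transpose.

```python
from collections import defaultdict


def transpose(m):
    """Materialize the transpose of a sparse matrix stored as {(row, col): value}."""
    return {(j, i): v for (i, j), v in m.items()}


def generic_product(a, b, a_join, a_out, b_join, b_out):
    """Hypothetical generic operator: join entries of a and b on one index
    position, multiply the matched values, and sum them per output cell.

    Each *_join / *_out argument is 0 or 1 and selects which position of an
    entry's (row, col) index is used for joining or for the output index.
    """
    # Group the entries of b by their join index.
    by_key = defaultdict(list)
    for idx, w in b.items():
        by_key[idx[b_join]].append((idx[b_out], w))

    # Probe with the entries of a and aggregate the products.
    result = defaultdict(float)
    for idx, v in a.items():
        for out_col, w in by_key.get(idx[a_join], ()):
            result[(idx[a_out], out_col)] += v * w
    return dict(result)


def multiply(a, b):
    """Ordinary product A * B: join a's column index with b's row index."""
    return generic_product(a, b, a_join=1, a_out=0, b_join=0, b_out=1)


def multiply_transposed_left(a, b):
    """Fused A^T * B: join a's row index with b's row index instead.
    No transposed copy of a is built; only the parameters change."""
    return generic_product(a, b, a_join=0, a_out=1, b_join=0, b_out=1)
```

In this sketch, `multiply_transposed_left(a, b)` returns the same cells as `multiply(transpose(a), b)`, but it reuses the same join-and-aggregate code path with different parameters. In a distributed setting the analogous benefit would be avoiding a separate pass over the data for the transpose, although the sketch itself makes no claim about how the paper's framework achieves this.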
Related resources
Numerical Simulation of a Lead-Acid Battery Discharge Process using a Developed Framework on Graphic Processing Units
In the present work, a framework is developed for implementation of finite difference schemes on Graphic Processing Units (GPU). The framework is developed using the CUDA language and C++ template meta-programming techniques. The framework is also applicable for other numerical methods which can be represented similar to finite difference schemes such as finite volume methods on structured grid...
INformation Systems RAM: Array processing over a relational DBMS
Developing multimedia applications in relational databases is hindered by a mismatch in computational frameworks. Efficient manipulation of multimedia data calls for array-based processing, which at best is available as a database add-on, not supported by the query optimizer. As a result, array-based processing ends up in dedicated programs outside the DBMS: non-reusable black boxes. The goal o...
RAM: Array Processing over a Relational DBMS
Developing multimedia applications in relational databases is hindered by a mismatch in computational frameworks. Efficient manipulation of multimedia data calls for array-based processing, which at best is available as a database add-on, not supported by the query optimizer. As a result, array-based processing ends up in dedicated programs outside the DBMS: non-reusable black boxes. The goal o...
Automaton Meets Algebra: A Hybrid Paradigm for Efficiently Processing XQuery over XML Stream
XML stream applications bring the challenge of efficiently processing queries on sequentially accessible token-based data streams. The automaton paradigm is naturally suited for pattern retrieval on tokenized XML streams, but requires patches for implementing the filtering or restructuring functionalities common for the XML query languages. In contrast, the algebraic paradigm is well-establishe...
Automaton meets algebra: A hybrid paradigm for XML stream processing
XML stream applications bring the challenge of efficiently processing queries on sequentially accessible token-based data streams. The automata paradigm is naturally suited for pattern recognition on tokenized XML streams, but requires patches for fulfilling the filtering or restructuring functionalities in the XML query language. In contrast, the algebraic paradigm is a well-established techni...
Publication date: 2016